1 Introduction


We seek a computational framework for detecting boundaries or edges present in gray level
images. We are guided by two notions from psychophysics espoused by Koenderink and
van Doorn [1] [2]:


• Primate visual function can be modeled by the activities of locally oriented receptive
fields which are the second, third, and possibly fourth order derivatives of the
Gaussian of scale t, $\phi_0(t) = \frac{1}{4\pi t} \exp\left(-\frac{\xi^2 + \eta^2}{4t}\right)$.

• In the visual system the natural coordinates on the retina, $\xi, \eta$, are locally oriented
by the direction of the gradient of the image smoothed by $\phi_0(t)$.


The activities of the receptive fields are given by convolving the image with the set 
of receptive fields at each point in the image, and from this collection of activities at 
each image point the local geometry is computed. These activities give a convenient 
representation of the image irradiance and are used to formulate an edge detector. 

With the observation that the activities can be related to the Taylor series expansion
of the image irradiance, I, about any point in the image, we can then give mathematical
forms, namely locally oriented derivatives, for local properties used to model edges and
other features.


2 Mathematical Formalism 


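The receptive fields are denoted $\phi_1, \phi_2, \phi_{11}, \phi_{12}, \phi_{22}, \phi_{111}, \phi_{112}, \ldots, \phi_{222}$ and computed by

$\phi_{1 \cdots 1\, 2 \cdots 2}(\xi, \eta) = \frac{\partial^{m}}{\partial \xi^{m}} \frac{\partial^{n}}{\partial \eta^{n}} \phi_0(\xi, \eta)$, where there are m 1's and n 2's in the subscript of $\phi$.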
While the receptive fields are defined over the infinite plane, in actual use we choose a
finite support size, W x W, large enough that the receptive field is very small at the window
edge. In Fig. 1 the receptive fields shown are for t = 2 with W = 21, which gives a value of
$10^{-4}$ or smaller for the ratio of the value at the window edge to the maximum value over
the entire window.
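
The window-size criterion can be checked numerically. The following is a minimal sketch,
assuming NumPy and the Gaussian normalisation $\phi_0 = \exp(-(x^2+y^2)/4t)/(4\pi t)$; the
function names `gaussian_fields` and `edge_to_max_ratio` are hypothetical and only the
zeroth and first order fields are sampled.

```python
import numpy as np

def gaussian_fields(t=2.0, W=21):
    """Sample phi_0 and its first derivatives on a W x W window.

    Assumes phi_0 = exp(-(x^2 + y^2) / (4 t)) / (4 pi t).
    """
    half = W // 2
    # x varies along columns, y along rows.
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    phi0 = np.exp(-(x**2 + y**2) / (4.0 * t)) / (4.0 * np.pi * t)
    phi_x = -x / (2.0 * t) * phi0   # d phi_0 / dx
    phi_y = -y / (2.0 * t) * phi0   # d phi_0 / dy
    return phi0, phi_x, phi_y

def edge_to_max_ratio(field):
    """Largest magnitude on the window border divided by the overall maximum."""
    border = np.concatenate([field[0], field[-1], field[:, 0], field[:, -1]])
    return np.abs(border).max() / np.abs(field).max()

phi0, phi_x, phi_y = gaussian_fields(t=2.0, W=21)
print(edge_to_max_ratio(phi0), edge_to_max_ratio(phi_x))
```

For t = 2 and W = 21 both printed ratios should fall below $10^{-4}$; higher order fields
decay more slowly relative to their maxima and would need to be checked separately.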

The receptive fields can be used to compute a finite Taylor series expansion of the
smoothed image at each retinal cell, called the jet. The subscripts 1 and 2 denote directions
in the local coordinate system along a level contour and along the gradient respectively,
where the contour direction is given by the convention $e_\xi = e_\eta \times e_I$, where
$e_\xi, e_\eta, e_I$ are unit vectors in the direction of increasing $\xi, \eta, I$ and the cross
product follows the right hand rule convention. Fig. 2 shows the coordinate convention. The
various receptive fields are named by their subscripts with the convention that the number
of 1's is the order of the derivative in the contour direction and the number of 2's the order
of the derivative in the gradient direction. The total number of subscripts is the order of
the receptive field.

In real images noise and quantization errors may prevent the Taylor series of the image
from being well defined at the point (x, y). However, if we use the derived image, $I \otimes \phi_0$,
given by smoothing the image with $\phi_0$, then we may expand about (x, y) to third order to
get

$[I \otimes \phi_0](\xi, \eta) \approx [I \otimes \phi_0] + \sum_{k=1}^{3} \frac{1}{k!} \left(\xi \frac{\partial}{\partial \xi} + \eta \frac{\partial}{\partial \eta}\right)^{k} [I \otimes \phi_0]$

where all the derivatives and $[I \otimes \phi_0]$ are evaluated at the point (x, y).
For the convolution operator we have 

$\frac{\partial^{m}}{\partial \xi^{m}} \frac{\partial^{n}}{\partial \eta^{n}} [I \otimes \phi_0] = I \otimes \phi_w$

where $\phi_w$ is shorthand for $\phi_{1 \cdots 1\, 2 \cdots 2}(\xi, \eta)$. This shows that we may compute activities
of the receptive fields and thereby the Taylor series by performing convolutions with the
receptive fields. At this point we have made no commitment to the orientation of the local
coordinates $\xi, \eta$.
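
The statement that derivatives of the smoothed image are obtained by convolving with
derivative receptive fields can be illustrated on a linear ramp. This is a minimal sketch,
assuming NumPy (version 1.20 or later for `sliding_window_view`) and image-aligned x, y
axes; `convolve_valid` and `activity_x` are hypothetical helper names.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def convolve_valid(image, kernel):
    """2-D convolution, kept only where the kernel lies fully inside the image."""
    windows = sliding_window_view(image, kernel.shape)
    # Flipping the kernel turns the windowed correlation into a convolution.
    return np.einsum("ijkl,kl->ij", windows, kernel[::-1, ::-1])

def activity_x(image, t=2.0, W=21):
    """Activity of the first order field along image x: I convolved with d(phi_0)/dx."""
    half = W // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    phi0 = np.exp(-(x**2 + y**2) / (4.0 * t)) / (4.0 * np.pi * t)
    phi_x = -x / (2.0 * t) * phi0
    return convolve_valid(image, phi_x)

# A ramp I(x, y) = 3 x has slope 3 everywhere, before and after smoothing.
ramp = 3.0 * np.tile(np.arange(40.0), (30, 1))
print(activity_x(ramp).mean())
```

The printed mean should be close to 3, the slope of the ramp, which is the first order
coefficient of the jet for this image.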


3 Choosing the Local Coordinates 


Suppose then, as proposed by Koenderink [2], that the biological visual system makes
no commitment as to the coordinate system it will use, but rather chooses to use all
coordinate systems. This it does by measuring the receptive field activities over many
directions $-\pi < \theta < \pi$. Thus the approximate continuum of quantities (as a function of $\theta$)


$a^{rot}_w(\theta) = I \otimes R_\theta(\phi_w)$

is available to the low level vision system. Here w is the receptive field name and $R_\theta(\cdot)$
is an operator that rotates its argument through an angle $\theta$. Assuming the image varies
smoothly, the maxima and zeroes of these activities define locally meaningful directions.
For example, at a particular image point, the angle $\theta$ for which $a^{rot}_2(\theta)$ is a maximum
defines a local direction with respect to the image x direction that is along the contour,
the 1 direction, at that image point. Note that for this angle $a^{rot}_1(\theta) = 0$. Similarly, at a
particular image point, the angle $\theta'$ for which $a^{rot}_1(\theta')$ is a maximum defines a local
direction with respect to the image x direction that is along the gradient, the 2 direction,
at that point.

The activities of the receptive fields incorporate the local geometry and provide a
useful representation in terms of which to formulate edge detectors. Unlike biological
vision, machine vision is typically presented with a much sparser set of activities: those
activities of receptive fields defined by derivatives along the image x and y directions. We
make contact with biological vision by defining the local coordinate 2 to be in the direction
$\theta_{loc} = \tan^{-1}\left(\frac{I \otimes \phi_y}{I \otimes \phi_x}\right)$, where $\phi_x$ and $\phi_y$ denote the first derivatives of $\phi_0$
along the image x and y directions, and we then compute the activities of the receptive
fields in this particular choice of local coordinates:

$a^{rot}_w = I \otimes R_{\theta_{loc}}(\phi_w)$.
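
For first and second order fields the rotation into local coordinates can be carried out
algebraically on the image-aligned activities. The sketch below assumes NumPy, takes the
activities along x and y as inputs (the argument names `Ix`, `Iy`, `Ixx`, `Ixy`, `Iyy` are
hypothetical), and uses `arctan2` rather than a plain arctangent to resolve the quadrant.
The 1 direction is taken as $(\sin\theta, -\cos\theta)$, following the cross product convention.

```python
import numpy as np

def local_activities(Ix, Iy, Ixx, Ixy, Iyy):
    """Rotate image-aligned activities into the local (1, 2) frame.

    Direction 2 is along the gradient (cos, sin); direction 1 is along the
    contour (sin, -cos). Inputs may be scalars or arrays of equal shape.
    """
    theta = np.arctan2(Iy, Ix)          # assumed form of theta_loc
    c, s = np.cos(theta), np.sin(theta)
    a1 = s * Ix - c * Iy                # vanishes by construction
    a2 = c * Ix + s * Iy                # gradient magnitude
    a11 = s * s * Ixx - 2.0 * s * c * Ixy + c * c * Iyy
    a12 = s * c * (Ixx - Iyy) + (s * s - c * c) * Ixy
    a22 = c * c * Ixx + 2.0 * s * c * Ixy + s * s * Iyy
    return theta, a1, a2, a11, a12, a22

theta, a1, a2, a11, a12, a22 = local_activities(3.0, 4.0, 2.0, 0.0, 5.0)
print(a1, a2, a11 + a22)
```

In this example `a1` is zero, `a2` equals the gradient magnitude 5, and `a11 + a22` equals
`Ixx + Iyy`, since the trace of the second order activities does not depend on the rotation.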


4 Edge Detection and Local Features 


A simple edge detector finds candidate edge points as those points where the gradient is
a local maximum. The Canny edge detector [3], [4] in doing this pays particular attention
to accurately finding the direction of the gradient at each pixel and then to doing a careful
interpolation of the change in gradient along this direction so as to find its local maximum.

If the image varies smoothly this is equivalent to locating the zero crossing in the change
of the gradient along the gradient direction. In local geometry this is $a^{rot}_1 = 0$, $a^{rot}_2 > 0$,
$a^{rot}_{22} = 0$, which means respectively: align 1 along the contour direction, consider only
points with nonzero gradient, locate the edge at the zero crossing of $a^{rot}_{22}$.
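
The three conditions can be turned into a pixel-level test. This is a simplified sketch,
assuming NumPy and precomputed arrays `a2` and `a22` (hypothetical names for the local
activities); it marks a sign change of `a22` between a pixel and its right or lower
neighbour instead of interpolating along the gradient direction, and it borrows the
0.1 max threshold used in the experiments of Section 6.

```python
import numpy as np

def edge_candidates(a2, a22, frac=0.1):
    """Mark pixels where a22 changes sign and a2 exceeds frac * max(a2)."""
    strong = a2 > frac * a2.max()
    sign = a22 > 0
    crossing = np.zeros_like(sign)
    # Sign change between a pixel and its right-hand neighbour.
    crossing[:, :-1] |= sign[:, :-1] != sign[:, 1:]
    # Sign change between a pixel and its lower neighbour.
    crossing[:-1, :] |= sign[:-1, :] != sign[1:, :]
    return strong & crossing

# Profile of a smoothed step: a2 peaks at the edge, a22 changes sign there.
x = np.arange(-5, 6) + 0.5
a2 = np.tile(np.exp(-x**2 / 4.0), (3, 1))
a22 = np.tile(-x / 2.0 * np.exp(-x**2 / 4.0), (3, 1))
print(edge_candidates(a2, a22).astype(int))
```

Each row of the printed mask has a single marked pixel, immediately to the left of the
sign change of `a22`.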

For comparison, the Marr-Hildreth edge detector seeks zero crossings of the Laplacian
of a Gaussian [5] and is given by $a^{rot}_{11} + a^{rot}_{22} = 0$. This is rotationally invariant, indicating
that it contains less local geometry than the Canny detector and thereby can be expected
not to perform as well as the Canny detector.

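Besides exploiting properties in the gradient direction we can compute properties along
the contour direction. Any contour of the image irradiance satisfies implicitly the equation
$I(x,y) \otimes \phi_0 = I_0$, where $I_0$ is the value of the smoothed irradiance that defines the contour.
Along the contour c, parametrized by arc length s, in image coordinates we have

$\frac{d}{ds}\Big|_c = \frac{dx}{ds}\Big|_c \frac{\partial}{\partial x} + \frac{dy}{ds}\Big|_c \frac{\partial}{\partial y}$
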
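Since $I \otimes \phi_0$ is constant on the contour we have $\left(\frac{d}{ds}\Big|_c\right)^n I \otimes \phi_0 = 0$, for all n. In particular
we have for n = 1 and n = 2, respectively

$\frac{dx}{ds} \frac{\partial}{\partial x}[I \otimes \phi_0] + \frac{dy}{ds} \frac{\partial}{\partial y}[I \otimes \phi_0] = 0$

$\frac{d^2x}{ds^2} \frac{\partial}{\partial x}[I \otimes \phi_0] + \frac{d^2y}{ds^2} \frac{\partial}{\partial y}[I \otimes \phi_0] + \left(\frac{dx}{ds}\right)^2 \frac{\partial^2}{\partial x^2}[I \otimes \phi_0] + 2 \frac{dx}{ds} \frac{dy}{ds} \frac{\partial^2}{\partial x \partial y}[I \otimes \phi_0] + \left(\frac{dy}{ds}\right)^2 \frac{\partial^2}{\partial y^2}[I \otimes \phi_0] = 0$

If we take the local coordinate system so that the contour lies along the 1 direction, then
$\frac{d\eta}{ds}\Big|_c = 0$, $a^{rot}_1 = 0$, and $\frac{d^2\eta}{ds^2}\Big|_c = -a^{rot}_{11}/a^{rot}_2$, which is the curvature along the contour.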
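
The curvature expression can be checked on an image whose level contours are known. The
sketch below assumes NumPy and image-aligned derivatives as inputs (the argument names are
hypothetical); it evaluates $-a_{11}/a_2$ in closed form by substituting the gradient
direction for the rotation angle.

```python
import numpy as np

def contour_curvature(Ix, Iy, Ixx, Ixy, Iyy):
    """Curvature of the level contour, -a11 / a2, from image-aligned derivatives."""
    grad2 = Ix**2 + Iy**2
    a2 = np.sqrt(grad2)                 # activity along the gradient
    # Second derivative along the contour direction (sin, -cos).
    a11 = (Iy**2 * Ixx - 2.0 * Ix * Iy * Ixy + Ix**2 * Iyy) / grad2
    return -a11 / a2

# I = x^2 + y^2 at (3, 4): the contour is a circle of radius 5.
print(contour_curvature(6.0, 8.0, 2.0, 0.0, 2.0))
```

The magnitude of the result is 1/5, the reciprocal of the circle radius; the negative sign
reflects the orientation convention, with the gradient pointing away from the centre.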

Another useful result is obtained by setting $\frac{d^3\eta}{ds^3}\Big|_c = 0$ in $\frac{d^3}{ds^3}\Big|_c I \otimes \phi_0 = 0$, which leads
to $a^{rot}_{111} a^{rot}_2 - 3 a^{rot}_{11} a^{rot}_{12} = 0$, a result given by Koenderink and van Doorn [1]. Points that
satisfy this criterion are called ridges and can be identified with corner points along edge
directions.


5 Accuracy of the Representation 


In accordance with the notion that the local geometry accurately represents the local image
structure we expect it should be possible to locate edges to sub-pixel resolution. This
approach uses the local geometry to model a smoothed representation of the image. Thus
the gradient directions are not quantized by the original pixel lattice (angles quantized
to multiples of $\pi/4$), but are accurately given to a fraction of a degree.
Similarly we expect the location of zero crossings to be given to a finer resolution
than an individual pixel.

We have tested this hypothesis by locating a zero crossing along a given direction using
an interpolation given by Canny [4]. In Fig. 3 a discretely sampled function h, with
$0 < \theta < \pi/4$, is given by values h(i, j), h(i + 1, j), and h(i + 1, j + 1) with h(i, j) < 0, and
h(i + 1, j), h(i + 1, j + 1) > 0. The value $h_{int} = (1 - \tan\theta)\, h(i+1, j) + \tan\theta\, h(i+1, j+1)$
is the linear interpolation of h in the direction $\theta$, and the zero crossing is located at a
distance $d = -h(i,j)\,(1 + \tan^2\theta)^{1/2} / (h_{int} - h(i,j))$. Note that in this case the zero crossing falls
outside the pixel. Similar interpolation formulas hold for other directions of $\theta$.
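
The interpolation can be written as a small function. This is a minimal sketch in plain
Python covering only the case $0 \le \theta \le \pi/4$ described above; the function and
argument names are hypothetical.

```python
import math

def subpixel_zero_crossing(h00, h10, h11, theta):
    """Locate the zero crossing of h along direction theta.

    h00 = h(i, j) < 0, h10 = h(i+1, j) and h11 = h(i+1, j+1) are the samples;
    theta is the direction in radians, between 0 and pi/4.
    Returns the interpolated value h_int and the distance d from (i, j).
    """
    tan_t = math.tan(theta)
    # Linear interpolation between the two samples on the column i + 1.
    h_int = (1.0 - tan_t) * h10 + tan_t * h11
    # Path length from (i, j) to the interpolated point is sqrt(1 + tan^2).
    d = -h00 * math.sqrt(1.0 + tan_t**2) / (h_int - h00)
    return h_int, d

print(subpixel_zero_crossing(-1.0, 1.0, 1.0, 0.0))
```

For h(i, j) = -1 and both neighbours equal to 1 at $\theta = 0$ the zero crossing lies
halfway between the samples, so the call returns an interpolated value of 1 and d = 0.5.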

In what follows we calculate for each candidate edge pixel the values 6 and d. To display 
the sub-pixel location of the edges we have found, the pixel in which the zero crossing point 
falls is dilated by a factor of 5 so that each original pixel is equivalent to 25 sub-pixels. 
The edge is then drawn as a digital line so as to pass through the zero crossing at an angle
perpendicular to 6. The digital line marks only those sub-pixels that contain the line. We 
do not extend the line outside the original pixel which contains the zero crossing. This 
we consider a crude approximation, but the results below bear out the claims of sub-pixel 
accuracy. 


6 Results 


We have constructed the local geometry of a simple synthetic image of a rectangle, of a
step edge with additive Gaussian noise, and of a SAR image (substantially subsampled to
remove speckle) taken from SEASAT of ice floes.


Synthetic Image of a Rectangle 

For the synthetic image in Fig. 4 the various activities of the receptive fields locate local
properties in the image. The zero crossings of the activities of the receptive fields $a^{rot}_{22}$
where $a^{rot}_2 > 0$ can be seen to locate the edges of the object, while the zero crossings of
$a^{rot}_{111} a^{rot}_2 - 3 a^{rot}_{11} a^{rot}_{12}$, the ridge detector, locate the corners. With so little structure in the
image it is difficult to give further meaning to the other receptive field activities. The
edges found lie to sub-pixel accuracy exactly along the rectangle sides, while at the corners
they are rounded, reflecting the effect of the smoothing by convolution with $\phi_0$.


Noisy Step Edges 

We created synthetic noisy step edges by adding Gaussian random noise of zero mean and
variance one to step edges of varying height. Images were then scaled to the grey scale
range of 0 to 255. Defining the signal to noise ratio (SNR) [6] as the ratio of the squared
step height to the variance of the Gaussian noise, we have found (see Fig. 5) that for SNR
< 1 the edge becomes broken while for SNR > 1 the edge is continuous. Here the window
size and receptive field scale are W = 21 and t = 2. In addition, zero crossings of $a^{rot}_{22}$ are
considered only if the values of $a^{rot}_2$ exceed $0.1 \max(a^{rot}_2)$.

As can be seen, the edge wanders about the true edge but remains smooth and continuous.
Roughly then, when the SNR exceeds 2, we expect this edge detector to perform
reasonably.
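
A test image of this kind can be generated as follows. This is a minimal sketch, assuming
NumPy, a vertical step in the middle of the image, and min-max scaling to the range 0 to
255; the image size, seed, and function name `noisy_step` are arbitrary choices, not
values taken from the experiments.

```python
import numpy as np

def noisy_step(snr, shape=(64, 64), seed=0):
    """Vertical step edge plus unit-variance Gaussian noise, scaled to 0..255.

    With noise variance one, SNR = height^2 / variance gives height = sqrt(snr).
    """
    rng = np.random.default_rng(seed)
    height = np.sqrt(snr)
    img = np.zeros(shape)
    img[:, shape[1] // 2:] = height          # step located at the middle column
    img += rng.normal(0.0, 1.0, size=shape)  # zero mean, variance one
    # Min-max scaling to the grey scale range (an assumed scaling rule).
    return (img - img.min()) / (img.max() - img.min()) * 255.0

for snr in (1.0, 2.0):
    img = noisy_step(snr)
    print(snr, img[:, :32].mean(), img[:, 32:].mean())
```

The two printed means per SNR value show how far the step sides are separated after
scaling, which is the contrast available to the edge detector.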

Ice Image


The ice image in Fig. 6 is a 256 x 256 grey scale image dilated in size from a 128 x 128
original image. This was processed using receptive fields of size W = 21 and t = 2. The
dilation was done to reduce numerical errors that would have resulted from using receptive
fields of size W = 11 and t = 1 on the original 128 x 128 image. As for the noisy step
edges, zero crossings of $a^{rot}_{22}$ are considered only if the values of $a^{rot}_2$ exceed $0.1 \max(a^{rot}_2)$.

Fig. 6 shows I, $I \otimes \phi_0$, $a^{rot}_2$, $a^{rot}_{22}$ in the top four images and edges located to one
pixel accuracy in the lower left. In the lower right are shown the values of d, the distance
of the zero crossing from the center of the found edge pixels, coded by intensity with
high intensity corresponding to larger d. The large variation in d suggests that the local
geometry contains more information that can be used to locate the edge, indeed to
sub-pixel accuracy.

To examine the sub-pixel accuracy we have enlarged the upper quadrant of the ice image
in Fig. 7. As can be seen, the edges found form a smooth, almost continuous boundary to
the butterfly shaped island and other regions in the image. We take this as evidence that 
the local geometry contains sufficient information to locate edges to a sub-pixel accuracy 
that increases resolution by a factor of five. 


7 Summary 


We have described a new representation, the local geometry, for early visual processing 
which is motivated by results from biological vision. This representation is richer than is 
often used in image processing. It extracts more of the local structure available at each 
pixel in the image by using receptive fields that can be continuously rotated and that 
go to third order in spatial variation. Early visual processing algorithms such as edge 
detectors and ridge detectors can be written in terms of various local geometries and are
computationally tractable. For example, Canny’s edge detector has been implemented in 
terms of a local geometry of order two, and a ridge detector in terms of a local geometry 
of order three. 

The edge detector in local geometry was applied to synthetic and real images and it 
was shown using simple interpolation schemes that sufficient information is available to 
locate edges with sub-pixel accuracy (to a resolution increase of at least a factor of five). 
This is reasonable even for noisy images because the local geometry fits a smooth surface,
the Taylor series, to the discrete image data.

Only local processing was used in the implementation so it can readily be implemented 
on parallel mesh machines such as the MPP [7]. We expect that other early visual
algorithms, such as region growing, inflection point detection, and segmentation, can also
be implemented in terms of the local geometry and will provide sufficiently rich and robust
representations for subsequent visual processing.

Figure 1: Receptive Fields with W = 21 (window size) and t = 2
arranged according to:

Figure 2: The local coordinates $\xi, \eta$ (also named 1, 2) of the image at a point x, y. The
surface shown is the image irradiance versus x, y.

Figure 5: Synthetic noisy step edges on the left and edges found on the right. Upper pair
is for SNR = 1 and lower pair for SNR = 2.


Figure 6: Results for the ice floe image according for W = 21, t 